A Computer-Program System
to Facilitate the Study of Technical Documents


Symbiont is a computer-and-program system for use in
research on computer-aided study. It stores, retrieves,
and displays documents and parts of documents. It
“semi-automates” the taking of verbatim notes. It
facilitates the manipulation and intercomparison of graphs.
And it conducts searches for passages of text that
contain specified words or phrases. Experience with
Symbiont and plans for its improvement are described.


• Introduction


The purpose of this paper is to describe a system, con-
sisting of a digital computer and a computer program,
intended for exploration of man-machine interaction and
computer assistance to man in the study of technical
documents.

The system provides a physical study situation that
includes a desk, an electric typewriter, a display screen,
and a light-sensitive pointer or stylus (“light pen”).
The user of the system, whom we shall call “the student,”
requests services and controls operations by typing com-
mand characters or symbols on the typewriter or by
touching illuminated areas of the display screen with
the light pen. The computer and program system, which
we call “Symbiont” because we hope to develop it into
a truly symbiotic partner of the student, displays infor-
mation to the student via the typewriter or the display
screen. The display screen, which is a 10-inch square
area on the face of a cathode-ray tube, represents alpha-
numeric symbols and graphs. Whenever part of a
displayed pattern is touched by the tip of the light pen,
the computer can tell what part was touched and when.
The combination of computer-controlled cathode-ray 
display and computer-signaling light pen is a convenient 
and flexible arrangement for man-computer communica- 
tion. 

Symbiont is an early stage of what we hope will be a 
continuing evolution. However, a sufficient set of func- 
tions has been implemented to lead us to take stock and 
gain experience in their use before modifying existing 
functions or adding new ones. 

Inasmuch as Symbiont is an exploratory tool, for use 
mainly by students who are at the same time experi- 
menters, we have not considered it necessary to perfect 
or polish. For example, the display flickers. With the aid 
of character-generation and display-buffering equipment, 
we could achieve a steady display: the technology is far 
enough advanced to fulfill the display function well. At 
present, however, the equipment required for flicker-free 
display is expensive, and we prefer to put available 
funds into other things. Our reasoning is that, in due 
course, good, steady displays will become relatively inex- 
pensive, and in the interim we can make allowances for 
a bit of flicker. The same argument applies to text 
storage capacity, text searching rate, and production of 
permanent copy. In short, our aim has been to realize 
several interesting functions now, even though in ways 
for which certain allowances have to be made, in order 
to gain early experience in using the functions and to 
provide a basis for practical system design when advances 
in technology make it possible to implement the func- 
tions effectively and economically. 


• Operations and Functions Implemented


A study session with Symbiont starts with the com- 
puter turned on, the basic program running, and the 
text and graphs of several technical documents already 
punched into machine-readable paper tape. The text is 
represented character-by-character in a standard alpha- 
numeric code. The curves of the graphs are represented 
numerically by coordinate values at selected points along 
the abscissae, and the calibrations, labels, and legend are 
represented alphanumerically in a prescribed format. 

At the beginning of his study session, the student loads 
representations of the documents he plans to study from 
an input tape into the computer memory. Then, typi- 
cally, he calls for a document and reads or scans it. He 
calls for it by typing any part of its standard bibli- 
ographic citation that specifies it uniquely—the author’s 
name, for example, or a major part of the title, or the 
name (and perhaps volume or year) of the journal in 
which the document was published. Symbiont finds the 
specified document and presents the first screen-page of 
it. (A screen-page is about 150 words in length. Lines
and pages have to be shorter on presently available
display screens than full lines and pages are in most
printed documents.) The student turns pages in the forward
direction by hitting the space bar of the typewriter. He 
may back up a page at a time by hitting the backspace 
key. 
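
The lookup-and-paging behavior just described can be sketched in a few lines of Python. This is a minimal illustration, not the original PDP-1 program: the names Document, find_document, screen_pages, and Reader are hypothetical, matching is assumed to be case-insensitive substring matching on the citation, and a screen-page is taken to be exactly 150 words.

```python
from dataclasses import dataclass

WORDS_PER_SCREEN_PAGE = 150  # the text says "about 150 words"


@dataclass
class Document:
    citation: str  # bibliographic citation at the head of the document
    text: str


def find_document(store, fragment):
    """Return the single document whose citation contains the fragment.

    Raises LookupError if no document, or more than one, matches, because
    the typed fragment is supposed to specify the document uniquely.
    Case-insensitive matching is an assumption of this sketch.
    """
    needle = fragment.lower()
    matches = [d for d in store if needle in d.citation.lower()]
    if len(matches) != 1:
        raise LookupError(f"{len(matches)} documents match {fragment!r}")
    return matches[0]


def screen_pages(text, words_per_page=WORDS_PER_SCREEN_PAGE):
    """Split text into screen-pages of at most words_per_page words."""
    words = text.split()
    pages = [" ".join(words[i:i + words_per_page])
             for i in range(0, len(words), words_per_page)]
    return pages or [""]


class Reader:
    """Holds the current screen-page: space goes forward, backspace goes back."""

    def __init__(self, document):
        self.pages = screen_pages(document.text)
        self.index = 0

    def key(self, ch):
        if ch == " ":
            self.index = min(self.index + 1, len(self.pages) - 1)
        elif ch == "\b":
            self.index = max(self.index - 1, 0)
        return self.pages[self.index]
```

Rejecting an ambiguous fragment, rather than picking the first match, follows the requirement that the typed part of the citation identify one document.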

While reading or scanning, the student comes upon a 
passage that he wants to record verbatim for future 
reference—a passage he would ordinarily copy onto a 
note card. With the aid of Symbiont, he records it on 
paper tape or in the note-file part of the computer 
memory. To punch it on paper tape, he touches the 
initial printed character or characters of the passage 
with the light pen and then types “b” (for “begin”). 
Underlining thereupon appears beneath the character(s) 
touched. Then he touches the final printed character(s) 
of the passage and types “e” (for “end”). Underlining 
thereupon appears beneath the ending of the passage, 
and immediately spreads back to the beginning. The 
passage is thus singled out for inspection by the student 
and for action by the computer. When the student types 
“p” (for “punch”), Symbiont punches the passage into 
paper tape. If the student next underlines the bibli- 
ographic-citation string that appears at the head of the 
document, Symbiont appends the citation to the note, 
thus handling a chore that ordinarily plagues the con- 
scientious notetaker when he takes his notes and the 
unconscientious notetaker when he tries to use his notes. 
The student can string any number of passages together 
by underlining them and punching them one at a time, 
in groups, or all at once. 
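
The begin, end, and punch sequence can be modeled as a small state machine. The sketch below is illustrative only: NoteTaker and its method names are hypothetical, a light-pen touch is reduced to a character index into the page, the paper tape is a Python list, and the citation is assumed to be punched with the same "b", "e", "p" sequence as any other passage.

```python
class NoteTaker:
    """Tracks one underlined passage on a screen-page and a list of punched notes."""

    def __init__(self, page_text):
        self.page = page_text
        self.begin = None       # index touched before typing "b"
        self.selection = None   # (start, end), inclusive, set by "e"
        self.tape = []          # stands in for the paper-tape punch

    def command(self, key, touched=None):
        if key == "b":
            # A new beginning discards any earlier underlining.
            self.begin, self.selection = touched, None
        elif key == "e":
            if self.begin is None:
                raise ValueError('"e" typed before "b"')
            # Sorting lets the student mark the two ends in either order
            # (an assumption; the text only describes begin-then-end).
            self.selection = tuple(sorted((self.begin, touched)))
        elif key == "p":
            if self.selection is None:
                raise ValueError("no passage is underlined")
            start, end = self.selection
            self.tape.append(self.page[start:end + 1])
        else:
            raise ValueError(f"unknown command {key!r}")

    def underlined(self):
        """Return the currently underlined text, or None."""
        if self.selection is None:
            return None
        start, end = self.selection
        return self.page[start:end + 1]
```

Keeping the selection after "p" means the same passage can be punched again or combined with later ones, in the spirit of stringing passages together.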

If the student prefers to note the passage in the com-
puter memory instead of on paper tape, he needs to specify
a “tag” with which to retrieve it. He specifies the tag
(before underlining the passage) by typing “t” (for 
“tag”) and then any symbol, or indeed any string of 
printing characters and spaces, terminated by a carriage 
return. He then underlines the passage and types “n” 
(for “note”). Alternatively, he can assign to the passage 
a “label,” which is functionally equivalent to a tag, but 
specified initially by underlining a string of characters 
on the screen with the light pen and then typing “l” (for
“label”). The procedure for connecting the label to its 
passage is the same as the procedure for connecting a 
tag. Tags and labels go into a “glossary” of retrieval 
terms associated with the note file. To see what the 
glossary holds at any time, the student types “g” and 
looks at the screen. If the glossary is more than one 
page long, he turns its pages as though it were text. 

Often the student wants to retrieve notes, and some- 
times he wants to amend or combine them. To retrieve 
a note, the student types “r” (for “retrieve”) and then 
types the tag or label (or if more convenient, designates 
a corresponding string of characters by underlining them 
with the light pen). In amending and combining re- 
trieved notes, the student is constrained by the present 
system to serial designation and concatenation of passages 
and subpassages. Under these constraints, editing is like 
operating a switch engine. However, it will be easy to 
introduce the operations of deletion and insertion. 
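
The note file, its glossary, and retrieval by tag or label can be pictured as a mapping from retrieval terms to stored passages. The following sketch makes several assumptions not stated in the text: the class NoteFile is hypothetical, tags and labels share one glossary, one term may hold several passages, and combining notes is plain concatenation in the order given.

```python
class NoteFile:
    """Passages stored in memory under tags or labels (treated alike here)."""

    def __init__(self):
        self._notes = {}  # retrieval term -> list of passages, in entry order

    def note(self, term, passage):
        """Store a passage under a tag or label ("n" after underlining)."""
        self._notes.setdefault(term, []).append(passage)

    def glossary(self):
        """List the retrieval terms, as typing "g" would display them."""
        return list(self._notes)

    def retrieve(self, term):
        """Return the passages filed under a term ("r" followed by the term)."""
        return list(self._notes.get(term, []))

    def combine(self, *terms, separator=" "):
        """Concatenate retrieved notes serially, the only editing now possible."""
        return separator.join(p for t in terms for p in self.retrieve(t))
```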

Verbatim notetaking and retrieval of notes are ad- 
mittedly minor matters. More vital is retrieval of pri- 
mary information. In the present context, since the stu- 
dent is assumed to be working with a small collection of 
documents known to be relevant to the topic under inves- 
tigation, the retrieval problem is not primarily one of 
finding documents. It is primarily one of finding pas- 
sages in documents that discuss particular ideas, pas- 
sages that are relevant to particular technical points. 
The approach of Symbiont to this problem is to auto- 
mate the scanning of text for specified configurations of 
retrieval terms. 

Symbiont carries out searches with reference to one, 
two, or three sets of retrieval terms. Each set may con- 
tain any number of terms of any length. For retrieval 
purposes, all the members of a set are assumed to be 
synonymous: Symbiont considers that it has found the 
set as soon as it finds any member of a set. Symbiont 
looks for members of the three sets within a “neighbor- 
hood” of text. A neighborhood is n lines in length, and 
the student can set n to any value he likes. Five lines 
make a good neighborhood. 

Before conducting a search, the student types “t” (for 
“terms”), then types the strings of characters that con- 
stitute the alternate terms of the first retrieval set, and 
types “1” to designate this set as the first. Then the 
student types “t,” the terms of the second set, and “2,” 
and finally “t,” the terms of the third set, and “3.” The
three sets might be, for example: 


    1             2             3

    cigarette     lung          cancer
    cigarettes    lungs         carcinoma
    cigar         pulmonary
    cigars
    pipe
    pipes
    tobacco
    tobaccos
    nicotine


The student then decides whether he wants a passage 
(neighborhood) dealing with one of the three, two of the 
three, or all three ideas (sets), and he initiates the search 
by typing “f1,” “f2,” or “f3” (for “find one,” etc.). 
Symbiont thereupon searches serially through the text 
until it either comes to the end or finds a neighborhood 
that meets the specifications. If it comes to the end, it 
displays “not found.” If it finds a neighborhood that 
meets the specifications, it displays on the screen the 
text containing the neighborhood, showing a small amount 
of preceding text and a larger amount of succeeding text. 
The student may turn pages, copy passages, etc., in the 
way described earlier, or he may type “f1,” “f2,” or “f3” 
and have Symbiont look for another passage that also 
meets the specification. 
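
The search can be sketched as a window of n lines sliding serially through the text. The function below is a minimal Python illustration under stated assumptions: find_neighborhood is a hypothetical name, matching is case-insensitive substring matching (so "cigar" also matches "cigarettes"), and words hyphenated across line ends are not rejoined.

```python
def find_neighborhood(lines, term_sets, required, n=5, start=0):
    """Find the first n-line neighborhood that satisfies the search.

    lines      -- the text, one string per line
    term_sets  -- one, two, or three lists of synonymous terms
    required   -- how many of the sets must be present ("f1", "f2", "f3")
    n          -- neighborhood length in lines
    start      -- line index at which to begin (used to continue a search)

    Returns the index of the first line of the neighborhood, or None,
    which corresponds to the "not found" display.
    """
    for first in range(start, len(lines)):
        window = " ".join(lines[first:first + n]).lower()
        # A set counts as found as soon as any one of its members appears.
        found = sum(
            any(term.lower() in window for term in terms)
            for terms in term_sets
        )
        if found >= required:
            return first
    return None
```

To look for a further passage after a hit at line k, one reasonable choice is to call the function again with start=k + n, so that the next neighborhood does not overlap the one already shown.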

Although the idea-retrieval technique just described is 
primitive, it is surprisingly effective if the student is 
clever in setting up the sets of terms. Typically, the 
student starts with a loose retrieval prescription and 
tightens it as he makes his way through his collection of 
documents. 

Graphs are composed by the computer from tabulated
data and presented on the screen. They are
displayed separately from text. They have keys that 
associate labels with curves; they have calibrated and 
labeled axes; and they have legends. Curves are approxi- 
mated by straight-line segments, dashed and/or dotted 
in eight patterns. A family of curves can have any num- 
ber of members, but in the present system, only one 
label. Up to eight families of curves can be superimposed 
upon one grid. Two grids can be set side-by-side to 
facilitate comparison. If the graphs are fundamentally 
comparable but different in scale factor, the student can, 
with the aid of the light pen, expand or compress the 
scales of one or the other until the two presentations are 
directly comparable. He adjusts the length or position 
of a line segment of the coordinate frame by touching 
one of its ends with the light pen (which “picks up” the 
end-point) and then moving the end-point to the desired 
location. If necessary, he repeats the procedure with the 
other end-point. The computer then rescales and relo- 
cates the entire graph. If two graphs are displayed side- 
by-side, one of them can be moved and superimposed 
upon the other, or curves can be transferred from one 
to another. These operations facilitate synthesis of a 
composite picture from results obtained by diverse 
investigators. 
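
Rescaling and relocating a graph when an end-point of its frame is moved amounts to a linear map from data values to screen positions along each axis. The sketch below shows one axis; axis_map and its parameter names are hypothetical, and screen positions are arbitrary display units.

```python
def axis_map(data_lo, data_hi, screen_lo, screen_hi):
    """Return a function mapping data values linearly onto a screen segment.

    (screen_lo, screen_hi) are the positions of the two end-points of the
    axis segment. Moving either end-point with the light pen corresponds
    to building a new map with the new position.
    """
    if data_hi == data_lo:
        raise ValueError("the data range must not be empty")
    scale = (screen_hi - screen_lo) / (data_hi - data_lo)

    def to_screen(value):
        return screen_lo + (value - data_lo) * scale

    return to_screen
```

For example, an axis running from 0 to 10 drawn between screen positions 100 and 300 places the value 5 at position 200; dragging the far end-point to 200 compresses the scale, and every curve point is redrawn through the new map.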

Symbiont makes it easy to modify not only the size 
of a graph but also the grid structure, the structure of 
the subdivision of the area within the graph. When it 
changes a grid, it also changes the numbers associated 
with the grid lines (i.e., the numbers associated with the 
scale-calibration points). 

At the bottom of the screen, there is a display of 
numerals and control symbols. By pointing with the 
light pen to individual numerals in proper sequence, the 
student can build up any number he needs. Then, desig- 
nating with the light pen the control symbol “SCALE” 
and a scale-calibration point he can substitute the 
assembled number for the number theretofore associated 
with the scale-calibration point. As soon as new num- 
bers have been associated with two calibration points 
on a linear axis scale, the computer substitutes new 
values at all the other calibration points on the axis. 
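
On a linear scale, two calibration points with known values determine the values at all the others. A minimal sketch, assuming each calibration point is identified by its position along the axis (recalibrate is a hypothetical name):

```python
def recalibrate(positions, i, value_i, j, value_j):
    """Return new values for every calibration point on a linear axis.

    positions -- screen positions of the calibration points along the axis
    i, j      -- indices of the two points the student has renumbered
    value_i, value_j -- the numbers assembled for those two points
    """
    if positions[i] == positions[j]:
        raise ValueError("the two calibration points must be distinct")
    slope = (value_j - value_i) / (positions[j] - positions[i])
    return [value_i + (p - positions[i]) * slope for p in positions]
```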

If he wants to change the number of grid lines that 
subdivide (say) the “pressure” scale of a graph, the stu- 
dent points with the light pen to the control symbol 
“GRID” and then to the label “PRESSURE” and then 
to the appropriate numeral corresponding to the desired 
number of grid lines. The computer immediately redraws 
the grid, leaving the extreme grid lines unchanged, and 
substitutes the appropriate new numbers near the inter- 
sections of the new grid lines and the horizontal axis. 
With these procedures, the student may experiment 
rapidly with various frames and grids, for he need 
specify only the essential parameters of each coordinate 
system. As soon as they are specified, Symbiont develops 
the detailed pattern. 
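
Redrawing a grid from a single parameter can be sketched as evenly spacing the requested number of lines between the two extreme lines, which stay fixed. The function name grid_lines is hypothetical, and the sketch assumes a linear scale and a count that includes both extremes.

```python
def grid_lines(lo, hi, count):
    """Return the scale values of `count` evenly spaced grid lines.

    The first and last lines stay at the extremes lo and hi; only the
    interior lines, and the numbers displayed beside them, change.
    """
    if count < 2:
        raise ValueError("a grid needs at least its two extreme lines")
    return [lo + (hi - lo) * k / (count - 1) for k in range(count)]
```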


• Evaluations and Plans for Improvement


Our experience in using Symbiont has been limited by 
shortage of input tapes and by smallness of the computer 
memory. A semi-automatic tape-preparation subsystem 
and an arrangement for moving information automati- 
cally between primary (core) and secondary (drum) 
memory are the items of highest priority in the plans for 
Symbiont II. Even on the basis of the limited experience, 
however, it seems clear to us that the functions provided 
by Symbiont I (the system thus far implemented) are 
effective as aids in technical study. The function of 
searching for ideas, as primitive as the implementation 
is in Symbiont I, is little short of powerful. The automa- 
tion of verbatim notetaking, despite shortcomings in 
human engineering, seems capable of serving as the 
foundation for efficient personal documentation systems. 

In Symbiont I, however, too many of the graph- 
handling functions deal with frames, grids, and labels, 
and not enough deal with curves. The limitation to linear 
transformations is highly constraining. We must admit,
therefore, that the graph-handling functions of Symbiont
I do little more than (a) afford convenience in the few 
parts of the over-all process of graph manipulation that 
they subsume and (b) make it seem plausible that a 
fuller set of functions (involving perhaps 10 times as 
much programming) would be truly useful. 

The plans for Symbiont II call for the following modi- 
fications of, and additions to, Symbiont I: 


1. A subsystem to “semi-automate” preparation of
input tapes of textual and graphical information. Be-
cause performance of the system during study does not
depend upon how the tapes were prepared, we deferred
work on a tape-preparation subsystem and relied upon
manual production of input tapes. Manual production 
proved not to be satisfactory. For Symbiont II, we plan 
to take text mainly from monotype and linotype tapes 
and to use computer film-reading techniques in convert- 
ing graphical data to tabular form. 

2. Extension of the storage areas, confined to core 
memory and supplementary paper tape in Symbiont I, 
to the magnetic drum (22 times 4,096 18-bit words) now 
associated with the PDP-1, and perhaps also from the 
drum to magnetic tape units. 

3. Substitution of light-pen for typewriter control of 
most operations that deal with information displayed on 
the screen. 

4. A descriptor-and-thesaurus system for retrieving 
documents from store. Symbiont I retrieves documents 
with the same searching system it uses in finding passages. 
(A bibliographic designation precedes each document in 
the store of text.) That will be too slow when the store 
becomes large. 

5. A scheme for turning several or many pages at a 
time or for going immediately to a particular page speci- 
fied by page number. 

6. More reliance upon predetermined sequences of 
manipulation and less upon control characters. For exam- 
ple, to underline a segment of text, it should suffice to 
point with the light pen to an “underline” light button, 
then to the beginning of the passage, and then to the end 
of the passage. It is an unnecessary nuisance to have to 
specify “end” after having specified “begin.” However, 
streamlining the procedure in this way will make it neces- 
sary to provide a way of reminding the student when 
he forgets where he is, in a sequence of operations, and 
a way of letting him linger on (or return to) a par- 
ticular operation long enough to correct a mistake in 
specifying it. 

7. Handling of notes precisely as though they were
documents. Notes will be permitted to contain graphs.
The note-retrieval glossary will be associated with the
document-retrieval system.

8. Acceptance of notes phrased by the student. This now
seems essential even though it is easy for him to record 
verbatim notes. 

9. Provision for extraction from text of individual 
words, individual phrases (delimited by punctuation 
marks), individual sentences, and individual paragraphs 
merely by pointing. It is an unnecessary nuisance to 
underline (i.e., to point to both ends of) a segment unless 
one wants to extract a sequence of characters that does 
not constitute a formal unit. 

10. Labeling of individual curves as well as of families. 

11. Labeling near the curve as an alternative to
associating label and curve by key.

12. Search for more than three sets of terms, and for 
other combinations (such as 1 and 2 or 1 and 3) than 
any m of n. 

13. Storage and retrieval of the sets of terms used in
searching text. It is not good to have to type a set of 
terms more than once, and it will be easy to store them 
for future reference. The student will be able to retrieve 
a set by typing any term in the set. Symbiont II will
display all the sets that contain the typed term and let 
the student select the one he wants by pointing to it. 

14. In designating parts of graphs to the program for 
action, more pointing to the parts themselves, and less 
pointing to their names. 

15. Transformation between linear and logarithmic 
coordinates. 

16. Fitting of curves (specified by type, such as sine, 
exponential, and power series) to tabulated numerical 
data, and determination of goodness of fit. 

17. Weighted averaging of curves. 
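
As an illustration of one planned item, the storage of term sets and their retrieval by any member term might look like the sketch below. It describes a plan, not an implemented feature; TermSetStore and its methods are hypothetical names, and exact, case-sensitive term matching is assumed.

```python
class TermSetStore:
    """Keeps the sets of terms used in searches so they need not be retyped."""

    def __init__(self):
        self._sets = []  # each stored set is a tuple of terms, in entry order

    def store(self, terms):
        """Save a set of terms unless an identical set is already stored."""
        candidate = tuple(terms)
        if candidate not in self._sets:
            self._sets.append(candidate)

    def containing(self, term):
        """Return every stored set that contains the typed term.

        The student would then pick the set he wants from this list.
        """
        return [s for s in self._sets if term in s]
```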


The present plan is to effect the foregoing improve- 
ments, to gain further experience, and then, in proceeding 
to the third generation of study facilities, to meld them 
with arrangements, not described, to facilitate the organ- 
ization and retrieval of notes and data and the prepara- 
tion of technical papers. For further information about 
the context of the Symbiont system, see reference (1) 
below. 